Simulation of Real-world Event Repositories for Evaluation of Data Analytics Solutions: Case of User Behavior Pattern Recovery
Abstract
Due to the lack of access to real-world event-log repositories in critical domains such as healthcare and banking, evaluating and maintaining data analytics algorithms has become a challenge. Generating synthetic log repositories that simulate a variety of complex real-world event-log repositories is an effective way of producing benchmarks for evaluating data analytics algorithms with information retrieval metrics. As an important case study for such a synthetic log repository, we populate an event-log repository with complex user-behavior instances in the healthcare domain, where a behavior is defined as a sequence of events performed by a user. Because user behavior is complex, we define a user-behavior pattern language (BPL) that allows domain experts both to represent the desired behavior patterns used by the log generator engine and to define a target user-behavior pattern to be searched for in the generated log repository. We use constraint-based approximate event-pattern matching techniques to search for and identify instances of the target pattern in the repository. In this paper, we introduce our BPL, present our event-log generator engine and the log repository it produces, and use our pattern matching algorithm to identify the extracted user behaviors in the repository, showing the practicality and usefulness of the proposed framework.
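The pipeline the abstract describes (plant known behavior instances in a synthetic log, search for a target pattern under a constraint, score the result with information retrieval metrics) can be illustrated with a minimal Python sketch. This is not the paper's BPL, generator engine, or matching algorithm, none of which are specified here. The event names, the `max_gap` constraint, and the helper names `generate_log`, `matches`, and `precision_recall` are all hypothetical choices made for illustration.

```python
import random

def generate_log(pattern, n_users, n_planted, noise_events, seed=0):
    """Build a toy log {user: [event, ...]} and return it with the set of
    users in whom the pattern was planted (the ground truth).
    Assumes noise_events shares no event names with pattern."""
    rng = random.Random(seed)
    users = [f"u{i}" for i in range(n_users)]
    planted = set(rng.sample(users, n_planted))
    log = {}
    for user in users:
        seq = [rng.choice(noise_events) for _ in range(rng.randint(5, 10))]
        if user in planted:
            # Insert the pattern events in order at random, distinct slots.
            slots = sorted(rng.sample(range(len(seq) + 1), len(pattern)))
            for offset, (slot, event) in enumerate(zip(slots, pattern)):
                seq.insert(slot + offset, event)
        log[user] = seq
    return log, planted

def matches(seq, pattern, max_gap):
    """True if the (non-empty) pattern occurs in order in seq with at most
    max_gap unrelated events between consecutive pattern events."""
    # Positions where the pattern prefix matched so far can end.
    ends = {i for i, event in enumerate(seq) if event == pattern[0]}
    for target in pattern[1:]:
        ends = {j for i in ends
                for j in range(i + 1, min(len(seq), i + 2 + max_gap))
                if seq[j] == target}
        if not ends:
            return False
    return bool(ends)

def precision_recall(found, truth):
    """Precision and recall of the retrieved users against the planted ones."""
    true_pos = len(found & truth)
    precision = true_pos / len(found) if found else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical healthcare-style events, chosen only for illustration.
PATTERN = ["login", "view_record", "export"]
NOISE = ["search", "print", "logout"]

log, planted = generate_log(PATTERN, n_users=20, n_planted=5,
                            noise_events=NOISE, seed=1)
found = {user for user, seq in log.items() if matches(seq, PATTERN, max_gap=2)}
precision, recall = precision_recall(found, planted)
```

Because the planted users are known, tightening `max_gap` shows directly how a stricter constraint trades recall for a narrower notion of the behavior, which is the kind of benchmark evaluation a synthetic repository makes possible.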
Similar resources
Defining Robust Recovery Solutions for Preserving Service Quality during Rail/Metro Systems Failure
In this paper, we propose a sensitivity analysis for evaluating the effectiveness of recovery solutions in the case of disturbed rail operations. Indeed, when failures or breakdowns occur during daily service, new strategies have to be implemented so as to react appropriately and re-establish ordinary conditions as rapidly as possible. In this context, the use of rail simulation is vital: for e...
Hybrid Method of Logistic Regression and Data Envelopment Analysis for Event Prediction: A Case Study (Stroke Disease)
Predictive analytics is an area of statistics that deals with extracting information from data and using it to predict trends and behavior patterns. Many mathematical models have been developed and used for prediction, and in some cases they have been found to be very strong and reliable. This paper studies different mathematical and statistical approaches for event prediction. The ...
Simulation of rainfall temporal distribution pattern using WRF Model (case study of Parsian dam basin)
During a rainfall event, the intensity of precipitation varies. Changes in the amount of precipitation during a rainfall event affect the resulting flood and its intensity. How rainfall changes over time during an event is described by the temporal distribution pattern of rainfall. For this purpose, the availability of rainfall data at short time scales is important that ...
An Investigation on the User Behavior in Social Commerce Platforms: A Text Analytics Approach
Nowadays, the tourism industry accounts for approximately 10% of global GDP, while it contributes only 3% to the economy in Iran. Since the pressure of US sanctions on the Iranian economy increases day after day, the necessity of paying attention to this industry as a source of foreign currency is felt more than ever. The purpose of this research is to analyze the reviews of users of social...
A Simulation Approach to Evaluate Performance Indices of Fuzzy Exponential Queuing System (An M/M/C Model in a Banking Case Study)
This paper includes a simulation model built to predict performance indices such as waiting time by analyzing a queue's components in the real world under uncertain and subjective situations. The objective of this paper is to predict the waiting time of each customer in an M/M/C queuing model. In this regard, to enable decision makers to obtain useful results with enough knowledge on th...